WeRateDogs is a popular Twitter account that posts hilarious comments and ratings of dogs from photos submitted by its followers. Active since November 2015, it has grown beyond being just a Twitter account and has even launched a store with dog-related products.
This prompted me to explore the WeRateDogs dataset, and it's no surprise that the insights from the dataset are interesting.
To understand the data, I took a look at the distribution of the dog ratings. Most dog ratings are between 10 and 14.
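For anyone who wants to reproduce this kind of check, here is a minimal sketch of tallying a rating distribution with Python's standard library. The `ratings` list is made up for illustration and is not taken from the WeRateDogs dataset; with pandas, a `value_counts()` call on the rating column would give a similar tally.

```python
from collections import Counter

# Illustrative rating numerators only; these are NOT values from the real dataset.
ratings = [10, 11, 12, 12, 13, 13, 13, 14, 9, 8, 12, 11, 10, 5, 14]

# Tally how often each rating occurs.
distribution = Counter(ratings)

# Count the ratings that fall in the 10-14 band (inclusive) and their share.
in_band = sum(1 for r in ratings if 10 <= r <= 14)
share_in_band = in_band / len(ratings)

# Print a crude text histogram, one row per distinct rating.
for rating in sorted(distribution):
    print(f"{rating:>2}: {'#' * distribution[rating]}")
print(f"Share between 10 and 14: {share_in_band:.0%}")
```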
With a basic understanding of the distribution of the dog ratings, I proceeded to find out the most common dog breeds in the dataset, and the top 5 in particular. To do this, I first looked at the breed names that appeared most frequently in the dataset using a word cloud - the higher the frequency, the bolder the name appears in the plotted image, as shown below:
From the word cloud, we see a number of dog breed names appearing bolder than the others, the boldest being Golden Retriever, followed by Labrador Retriever, Toy Poodle, and Pembroke, to name a few. For more granular detail, the top 5 breeds were also plotted.
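The idea behind both the word cloud and the top 5 plot is a simple frequency count. Below is a rough sketch of that count using only the standard library. The `breeds` list is hypothetical sample data, and the `relative_size` scaling is my own simplification of how a word cloud sizes words, not the exact rule any particular plotting library uses.

```python
from collections import Counter

# Hypothetical breed labels, one per tweet; NOT taken from the real dataset.
breeds = [
    "golden_retriever", "labrador_retriever", "golden_retriever", "pembroke",
    "toy_poodle", "golden_retriever", "labrador_retriever", "chihuahua",
    "pug", "pembroke", "golden_retriever", "labrador_retriever",
]

# Count occurrences of each breed and keep the five most frequent.
breed_counts = Counter(breeds)
top_5 = breed_counts.most_common(5)

# Assumption: a word is drawn larger in proportion to its count,
# relative to the most frequent word.
max_count = top_5[0][1]
relative_size = {breed: count / max_count for breed, count in top_5}

for breed, count in top_5:
    print(f"{breed:<20} {count:>3}  size={relative_size[breed]:.2f}")
```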
We see that the top 5 breeds with the most likes (favorite count) are:
Next, let's take a look at the top 5 dog breeds with the most likes, shown below:
Next, I sought to know whether a relationship exists between the likes (favorite count) and the retweet count:
Interestingly, as depicted by the straight line running from the origin (0) toward the upper right and fitting most of the points in the scatter plot, a strong relationship exists between the favorite count (likes) and the retweet count. The correlation coefficient, a statistical measure of the strength of a linear relationship, between retweet_count and favorite_count is 0.93, which indicates a strong positive relationship.
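For readers curious about how a correlation coefficient like this is obtained, here is a small sketch of the Pearson formula written out by hand. The helper name `pearson_r` and the sample numbers are mine, chosen for illustration; they are not the real dataset and will not reproduce the 0.93 figure. With pandas, `DataFrame.corr()` computes the same statistic by default.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up counts for illustration only; NOT values from the real dataset.
retweet_count = [120, 450, 900, 1500, 3200, 5000]
favorite_count = [400, 1300, 2500, 4800, 9000, 16000]

r = pearson_r(retweet_count, favorite_count)
print(f"r = {r:.2f}")
```

A value close to 1 means the points sit near an upward-sloping straight line, which is what the scatter plot suggests.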
Wouldn't it be insightful to know which dog stage is most common? I thought so too. After finding the top 5 most common dog breeds in the dataset, as well as the top 5 dog breeds with the highest rating, out of curiosity I set out to find which of the four (4) dog stages is most dominant in the dataset.
From the plot, we see that the pupper stage is the dog stage that got the highest rating, followed by doggo, puppo, and, last, floofer.
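A per-stage summary like this can be sketched in a few lines. The example below groups ratings by stage and reports both how many tweets each stage has and its average rating, since those two rankings need not agree. The `rows` data is hypothetical and is not drawn from the real dataset.

```python
from collections import defaultdict

# Hypothetical (stage, rating) pairs; NOT values from the real dataset.
rows = [
    ("pupper", 11), ("pupper", 12), ("pupper", 10), ("doggo", 12),
    ("doggo", 13), ("puppo", 13), ("floofer", 12), ("pupper", 11),
]

# Group the ratings by dog stage.
ratings_by_stage = defaultdict(list)
for stage, rating in rows:
    ratings_by_stage[stage].append(rating)

# Number of tweets and mean rating for each stage.
stage_counts = {s: len(r) for s, r in ratings_by_stage.items()}
stage_mean = {s: sum(r) / len(r) for s, r in ratings_by_stage.items()}

# The stage that appears most often in the sample.
most_common_stage = max(stage_counts, key=stage_counts.get)

for stage in sorted(stage_counts, key=stage_counts.get, reverse=True):
    print(f"{stage:<8} count={stage_counts[stage]}  mean={stage_mean[stage]:.2f}")
```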